Preserve variables order in Dataset.concat() #1049

Merged: 3 commits, Nov 2, 2016

fmaussion
Member

Alternative (and much easier) implementation to #1048

The problem with the coordinate variables remains the same:

data_vars = list(ds.data_vars.keys())
self.assertEqual(data_vars, data_vars_ref)
# coords are now at the end of the list, so the test below fails
# self.assertEqual(all_vars, all_vars_ref)
Member

Actually, I think this test passes in this version?

Member Author

It still doesn't pass on my linux machine :(

Member

Indeed, concat does not yet preserve the order of all variables. But at least we can make the order deterministic.

Member

You could probably do this by consolidating the two loops over all variables. But up to you if you want to go through the trouble.

shoyer commented Oct 15, 2016

Agreed, this is the better approach. Thanks for diving into this!

shoyer commented Oct 15, 2016

Alternatively, it looks like the issue is on line 266 where we iterate for k in concat_over. Instead, we could just do:

for k in variables:
    if k in concat_over:
        ...

This has the advantage of preserving the set, which is important if there is a very large number of variables.
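
A minimal sketch of the pattern suggested above, using plain Python stand-ins rather than the actual xarray internals. The helper name `ordered_concat_keys` and the sample variable names are assumptions made for illustration; only `variables` and `concat_over` come from the comment. Iterating over the ordered mapping and testing membership in the set yields keys in the mapping's order, and `concat_over` can stay a set, so each lookup remains cheap.

```python
from collections import OrderedDict


def ordered_concat_keys(variables, concat_over):
    """Return the keys of `variables` that are in `concat_over`, in mapping order.

    Hypothetical helper for illustration only; it is not part of xarray.
    """
    # Iterate over the ordered mapping, not the set, so the result follows
    # the order of `variables`. Membership tests against the set stay O(1)
    # on average, which matters when there are many variables.
    return [k for k in variables if k in concat_over]


# Stand-in for a dataset's variables, in their original order.
variables = OrderedDict(
    [("time", 0), ("temperature", 1), ("pressure", 2), ("humidity", 3)]
)
# Stand-in for the set of variable names to concatenate.
concat_over = {"humidity", "temperature", "pressure"}

result = ordered_concat_keys(variables, concat_over)
print(result)  # ['temperature', 'pressure', 'humidity']
```

Iterating directly with `for k in concat_over` would instead follow the set's iteration order, which for string keys is not guaranteed to match the dataset order and can differ between interpreter runs.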

@fmaussion
Member Author

Ok, done. The second problem about the coordinate variable seems trickier to solve though...

fmaussion changed the title from "WIP2: preserve variables order in Dataset.concat()" to "Preserve variables order in Dataset.concat()" on Oct 15, 2016
@fmaussion
Member Author

> You could probably do this by consolidating the two loops over all variables. But up to you if you want to go through the trouble.

@shoyer could you elaborate on that? If there's a way to make it right I am happy to try, but I'm afraid I'm not sure what is the way to go.

shoyer commented Nov 2, 2016

OK, I'm just going to merge this. We can iterate on this more later.
